Abstract: Dirty paper coding (DPC) refers to methods for 
pre-subtraction of known interference at the transmitter of a 
multiuser communication system. There are numerous applications 
for DPC, including coding for broadcast channels. Recently, 
lattice-based coding techniques have provided several designs for 
DPC. In lattice-based DPC, there are two codes - a convolutional 
code that defines a lattice used for shaping and an error 
correction code used for channel coding. Several specific designs 
have been reported in the recent literature using convolutional 
and graph-based codes for capacity-approaching shaping and 
coding gains. In most of the reported designs, either the encoder 
works on a joint trellis of shaping and channel codes or the 
decoder requires iterations between the shaping and channel 
decoders. This results in high complexity of implementation. In 
this work, we present a lattice-based DPC scheme that provides 
good shaping and coding gains with moderate complexity at both 
the encoder and the decoder. We use a convolutional code for 
sign-bit shaping, and a low-density parity check (LDPC) code 
for channel coding. The crucial idea is the introduction of a one- 
codeword delay and careful parsing of the bits at the transmitter, 
which enable an LDPC decoder to be run first at the receiver. 
This provides gains without the need for iterations between the 
shaping and channel decoders. Simulation results confirm that at 
high rates the proposed DPC method performs close to capacity 
with moderate complexity. As an application of the proposed DPC 
method, we show a design for superposition coding that provides 
rates better than time-sharing over a Gaussian broadcast channel. 



I. Introduction 

Situations where interference is known non-causally at 
the transmitter but not at the receiver model several useful 
multiuser communication scenarios. In [1], Costa introduced 
and studied coding for such situations and called it "writing on 
dirty paper". Dirty paper coding (DPC) is now recognized as a 
powerful notion central to approaching capacity on multiuser 
channels. 

Lattice-based ideas for DPC were suggested and shown to 
be capacity-approaching in [2], [3]. Recently, many designs 
of lattice-based DPC schemes have been proposed in [4]-[9]. 
Lattice-based schemes typically use cosets of a convolutional 
code for lattice-quantizing or shaping to minimize the energy 
of the difference of the coded symbols and the interfering 
symbols. A part of the message bits is used to choose the 
specific coset used in the minimization. In addition to the 
shaping convolutional code, an error correction code needs 
to be used to obtain coding gain and approach capacity. 



The main source of complexity in lattice-based DPC designs 
is combining shaping and coding encoders/decoders at the 
transmitter/receiver. Simple concatenation schemes are not 
applicable for the following reasons: outer shaping 
followed by inner coding results in unshaped parity symbols 
that increase transmitted energy, while outer coding followed 
by inner shaping results in a poor inner code that needs to be 
iteratively decoded at the receiver with the outer code. 

In [6], encoding is done on a combined trellis of the source 
code (Turbo TCQ) and a channel code (Turbo TCM). At the 
receiver, decoding is done for Turbo TCM followed by syndrome 
computation to recover message bits. The transmitter 
is complex in [6] because of the use of the joint trellis. The 
DPC method proposed in [7] is similar to that of [?]. In [?], 
multilevel coding is used, and there are different codes for 
different bits of the symbols. At the receiver, iterations have 
to be performed between decoders for some of the channel 
codes and the shaping decoder. In [8] and [?], shaping follows 
channel coding and the receiver performs iterations between 
the shaping and channel decoders. 

In this work, we propose a lattice-based method that uses a 
novel combination of a convolutional code for sign-bit shaping 
and a low-density parity-check (LDPC) code for channel 
coding. As shown in specific designs and simulations, the 
method provides good shaping and coding gains at moderate 
complexity. The main idea for reducing complexity at the 
receiver is the introduction of a one-codeword delay at the 
transmitter, and the shaping of symbols from current message 
bits combined with parity bits from the previous codeword. 
This enables the LDPC decoder to be run first at the receiver 
(with a one-codeword delay) without any need for iterations 
with a shaping decoder. As an application, we use the proposed 
DPC method to design codes for superposition coding in 
two-user Gaussian broadcast channels. By simulations, we show 
that rate points outside the time-sharing region are achieved. 

The rest of the paper is organized as follows. After a brief 
review of the lattice-based DPC coding method in Section II, 
we present the proposed DPC method in Section III. 
This is followed by description and simulation of specific 
designs of DPC codes in Section IV. In Section V, the design 
of a superposition scheme using the proposed DPC method 
is described and simulation results are presented. Concluding 
remarks are made in Section VI. 



II. Lattice Dirty Paper Codes 



In a Gaussian dirty paper channel, the received symbol 
vector $Y = [Y_1\ Y_2\ \cdots\ Y_n]$ is modeled as 

$$Y = X + S + N,$$

where $X = [X_1\ X_2\ \cdots\ X_n]$ denotes the transmitted vector, 
$S = [S_1\ S_2\ \cdots\ S_n]$ denotes the interfering vector, assumed 
to be known non-causally at the transmitter, and $N$ denotes 
the additive Gaussian noise vector. The transmit power is 
assumed to be upper-bounded by $E[|X|^2] \le P_X$ per symbol, 
and the interference power is denoted $E[|S|^2] = P_S$ per 
symbol. The noise variance per symbol is denoted $P_N$. In 
[1], Costa shows that the capacity of the dirty paper channel 
is $\frac{1}{2}\log\left(1 + \frac{P_X}{P_N}\right)$, i.e., known interference can be canceled 
perfectly at the transmitter. 

The interfering vector $S$ is used as an input in the encoding 
process and plays an important role in determining a suitable 
transmit vector $X$. A coding strategy for choosing $X$ needs 
to overcome the imminent addition of $S$ and protect the 
transmitted information from the addition of the noise $N$. Such 
coding strategies are called dirty paper coding (DPC) methods. 

In [3], a dirty paper coding (DPC) scheme based on 
lattice strategies was proposed and shown to achieve the 
capacity of the dirty paper channel. We follow [4] for a 
brief review of the transmitter and receiver structure in the 
lattice DPC method [3]. Let $\Lambda$ denote an $n$-dimensional 
lattice with fundamental Voronoi region $\mathcal{V}$ having average 
second moment $P(\Lambda) = P_X$ and normalized second moment 
$G(\Lambda)$. Also let $U \sim \mathrm{Unif}(\mathcal{V})$, i.e., $U$ is a random variable 
(dither) uniformly distributed over $\mathcal{V}$. The lattice transmission 
approach of [3], [4] is as follows. 

• Transmitter: The input alphabet $\mathcal{X}$ is restricted to $\mathcal{V}$. For 
any $v \in \mathcal{V}$, the encoder sends 

$$X = [v - \alpha S - U] \bmod \Lambda, \tag{1}$$

where $\alpha = \frac{P_X}{P_X + P_N}$ is the MMSE scaling factor. 

• Receiver: The receiver computes 

$$Y' = [\alpha Y + U] \bmod \Lambda. \tag{2}$$

The channel from $v$ to $Y'$ defined by (1) and (2) is equivalent 
in distribution to 

$$Y' = [v + N'] \bmod \Lambda, \tag{3}$$

where 

$$N' = [(1 - \alpha) U + \alpha N] \bmod \Lambda. \tag{4}$$

A lower bound on the achievable rate of the above equivalent 
channel is shown in [4] to be 

$$\frac{1}{n} I(V; Y') \ge \frac{1}{2}\log_2(1 + \mathrm{SNR}) - \frac{1}{2}\log_2\left(2\pi e\, G(\Lambda)\right). \tag{5}$$

For optimal shaping, $G(\Lambda) \to \frac{1}{2\pi e}$ and we approach the capacity 
of the dirty paper channel. Note that the dither is assumed to 
be known at the transmitter and receiver (say, through the use 
of a common seed in a random number generator). 



III. Proposed Scheme 

The proposed scheme uses a convolutional code for sign-bit 
shaping [10] and low-density parity-check (LDPC) codes 
for channel coding. We assume an $M$-PAM signal constellation 
with a carefully chosen bit-to-symbol mapping that is compatible 
with sign-bit shaping and bit-interleaved coded modulation 
(BICM) [11]. For $M = 16$, the constellation and mapping are 
shown in Fig. 1. The mapping in Fig. 1 is suited for sign-bit 
shaping, since a flip of the most significant bit results in a 
significant change in symbol value for all possible 4-bit inputs. 
Also, the mapping is mostly Gray except for a few symbol 
transitions. Gray mapping is known to be the most effective 
mapping for BICM with LDPC codes. This heuristic choice of 
mapping enables the possibility of good shaping and coding 
gains to be obtained simultaneously. As expected, larger values 
of $M$ will result in larger shaping gains in our design, and 
we stick to the 16-PAM shown in Fig. 1 for illustration and 
simulation. 
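
The two properties claimed for the mapping can be checked directly from the labels listed in Fig. 1. The following is a minimal sketch; the names `LABELS`, `F_M` and `flip_sign_bit` are illustrative and do not come from the paper.

```python
# Labels transcribed from Fig. 1, left to right (-15/2 up to 15/2).
LABELS = ["1000", "1001", "1011", "1010", "1110", "1111", "1101", "1100",
          "0000", "0001", "0011", "0010", "0110", "0111", "0101", "0100"]
M = len(LABELS)
VALUES = [-(M - 1) / 2 + i for i in range(M)]   # -7.5, -6.5, ..., 7.5
F_M = dict(zip(LABELS, VALUES))                 # bit label -> PAM symbol


def flip_sign_bit(label):
    """Flip the most significant (sign) bit of a 4-bit label."""
    return ("1" if label[0] == "0" else "0") + label[1:]


def hamming(a, b):
    """Number of positions in which two labels differ."""
    return sum(x != y for x, y in zip(a, b))


# Magnitude of the symbol change caused by a sign-bit flip, over all labels.
sign_flip_shifts = {abs(F_M[b] - F_M[flip_sign_bit(b)]) for b in LABELS}

# Adjacent symbols whose labels differ in more than one bit (non-Gray steps).
non_gray = [(LABELS[i], LABELS[i + 1]) for i in range(M - 1)
            if hamming(LABELS[i], LABELS[i + 1]) != 1]

print(sign_flip_shifts)
print(non_gray)
```

With these labels, every sign-bit flip moves the symbol by M/2 = 8, and the only non-Gray step between neighbouring symbols is the one across the origin.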

A. Encoder Structure 

The encoder structure for the proposed scheme is 
shown in Fig. 2. We describe the operations in the encoder 
at time step $T$, or in the $T$-th block. A $k$-bit message 
$m = [m_1\ m_2\ \cdots\ m_k]$ is encoded into an $s$-symbol 
vector $v = [v_1\ v_2\ \cdots\ v_s]$ from the $M$-PAM constellation 
$A = \{-(M-1)/2, \cdots, -1/2, 1/2, \cdots, (M-1)/2\}$, where 
$s = n/\log_2 M$ is assumed to be an integer. Let $l = \log_2 M$ and 
let $f_M : \{0,1\}^l \to A$ denote the bit-to-symbol mapping. The 
bits that map to the $i$-th symbol are denoted $z_i a_{2i} a_{3i} \cdots a_{li}$; 
the sign-bit vector is denoted $z = [z_1\ z_2\ \cdots\ z_s]$, and we define 
vectors $a_j = [a_{j1}\ a_{j2}\ \cdots\ a_{js}]$ for $2 \le j \le l$. Finally, we have 
$v = f_M(z\, a_2 \cdots a_l)$, where $f_M$ operates component-wise on 
a vector input. 

Let us assume that the vectors $a_j$, $2 \le j \le l$, are available 
at the encoder. The sign-bit shaping convolutional code is 
used to determine the sign-bit vector $z$ as follows. A part 
of the message $m' = [m_1\ m_2\ \cdots\ m_{k'}]$ with $k' < k$ bits is 
mapped to a coset leader of the convolutional code using an 
inverse syndrome former [10]. Note that we need the rate of 
the convolutional code to be $1 - k'/s$. Let the coset chosen by 
$m'$ be denoted $C(m')$. The sign-bit vector $z$ is chosen from 
$C(m')$ so as to minimize the squared sum (energy) of the 
vector $(v - \alpha S) \bmod M$, where $\alpha = \frac{P_X}{P_X + P_N}$ is the MMSE 
factor and $S$ is the interference vector. That is, 

$$z = \arg\min_{u \in C(m')} \left\| \left( f_M(u\, a_2 \cdots a_l) - \alpha S \right) \bmod M \right\|^2. \tag{6}$$

The minimization in (6) is implemented using the Viterbi 
algorithm [10]. 
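
To make the objective in (6) concrete, the sketch below evaluates it by exhaustive search over a very small coset. This is only an illustration: the paper uses the Viterbi algorithm on the trellis of a convolutional code, whereas the length-4 code, the coset leader, the interference values and the helper names here are all hypothetical.

```python
M = 16
# Lower three label bits, in the order they appear in each half of Fig. 1.
LOW_BITS = ["000", "001", "011", "010", "110", "111", "101", "100"]


def f_M(sign_bit, low_bits):
    """Bit-to-symbol map of Fig. 1: sign bit 1 selects the negative half."""
    base = -7.5 if sign_bit == 1 else 0.5
    return base + LOW_BITS.index(low_bits)


def mod_M(x):
    """Reduce x to the interval [-M/2, M/2)."""
    return (x + M / 2) % M - M / 2


def energy(sign_bits, low_bits, alpha, S):
    """Energy of (f_M(u a_2 ... a_l) - alpha * S) mod M for a candidate u."""
    return sum(mod_M(f_M(u, a) - alpha * s) ** 2
               for u, a, s in zip(sign_bits, low_bits, S))


def shape(coset, low_bits, alpha, S):
    """Exhaustive version of (6): the coset member with the least energy."""
    return min(coset, key=lambda u: energy(u, low_bits, alpha, S))


# Hypothetical toy shaping code of length 4 (NOT the paper's convolutional code).
code = [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)]
leader = (1, 0, 0, 0)  # stands in for the coset leader selected by m'
coset = [tuple(c ^ t for c, t in zip(cw, leader)) for cw in code]

low_bits = ["011", "110", "000", "101"]   # bits a_2 a_3 a_4 of each symbol
S = [3.0, -6.2, 11.4, 0.7]                # hypothetical interference samples
alpha = 0.99                              # hypothetical MMSE factor
z = shape(coset, low_bits, alpha, S)
print(z, energy(z, low_bits, alpha, S))
```

Exhaustive search grows exponentially with the block length, which is why a trellis search is needed for block lengths such as s = 10000.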

The $a_j$, $2 \le j \le l$, are determined as follows. An $(n, k - k' + s)$ 
LDPC code is used at the encoder with a systematic encoder 
$E : \{0,1\}^{k-k'+s} \to \{0,1\}^n$. Let $m'' = [z\ m_{k'+1}\ \cdots\ m_k]$ be 
input to the systematic LDPC encoder to obtain the codeword 
$E(m'') = [m''\ p_T]$, where $p_T$ is the parity-bit vector for the 
$T$-th block. The parity-bit vector is delayed by one time step. 
For the $T$-th block, the $n - s = s(l - 1)$ bits in $[m_{k'+1}\ \cdots\ m_k]$ 
and $p_{T-1}$ are rearranged by a permutation $\Pi$ to form the 
vectors $a_j$, $2 \le j \le l$. This permutation is necessary in an 
implementation of BICM [11]. 

Bit label:  1000  1001  1011  1010  1110  1111  1101  1100  0000  0001  0011  0010  0110  0111  0101  0100 
Symbol:    -15/2 -13/2 -11/2  -9/2  -7/2  -5/2  -3/2  -1/2   1/2   3/2   5/2   7/2   9/2  11/2  13/2  15/2 

Fig. 1. 16-PAM constellation. 

Fig. 2. Encoder structure. 

Note that both the shaping and coding objectives have 
been met at the encoder. The transmitted symbols $(v - \alpha S) \bmod M$ 
have minimal energy in the lattice defined by sign-bit 
shaping using the convolutional code. Selected bits in 
successive blocks of symbols form codewords of the LDPC 
code. In summary, the encoder structure achieves DPC shaping 
and LDPC coding with bit-interleaved modulation. 
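
The bit bookkeeping behind the one-block delay can be sketched as follows. This is a rough illustration under stated assumptions: `block_sizes` and `form_lower_bits` are hypothetical helper names, the permutation in the small example is arbitrary, and no actual LDPC or convolutional encoding is performed.

```python
from math import log2


def block_sizes(n, k, k_prime, M):
    """Bit counts per block for the parameters n, k, k', M of the text."""
    l = int(log2(M))
    s = n // l                      # symbols (and sign bits) per block
    ldpc_k = k - k_prime + s        # systematic LDPC input [z, m_{k'+1..k}]
    parity = n - ldpc_k             # parity bits p_T, sent with block T+1
    return {"l": l, "s": s, "ldpc_k": ldpc_k, "parity": parity,
            "conv_rate": 1 - k_prime / s, "bits_per_symbol": k / s}


def form_lower_bits(msg_tail, prev_parity, perm, l, s):
    """Permute [m_{k'+1..k}, p_{T-1}] and split the result into a_2, ..., a_l."""
    bits = list(msg_tail) + list(prev_parity)
    assert len(bits) == s * (l - 1) == len(perm)
    mixed = [bits[i] for i in perm]
    return [mixed[j * s:(j + 1) * s] for j in range(l - 1)]


# Parameters used later in the simulations: n = 40000, k = 30000, k' = 5000, M = 16.
sizes = block_sizes(n=40000, k=30000, k_prime=5000, M=16)
print(sizes)

# Tiny example with l = 3 and s = 2, so a_2 and a_3 hold 4 bits in total:
# three current message bits plus one parity bit from the previous block.
a_vectors = form_lower_bits(msg_tail=[1, 0, 1], prev_parity=[0],
                            perm=[3, 0, 2, 1], l=3, s=2)
print(a_vectors)
```

For the simulation parameters, the 25000 current message bits and the 5000 delayed parity bits exactly fill the s(l - 1) = 30000 non-sign bit positions of a block.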

B. Decoder Structure 

The decoder for the proposed scheme is shown in Fig. 3. 
The demapper computes log-likelihood ratios (LLRs) for the 
bits from the received symbols in $Y' = \alpha Y + U$. The LLRs of 
the $(k - k')$ message bits, after a delay of one time step, and 
the LLRs of the $n - (k - k' + s)$ parity bits are de-interleaved. 
The $s = n/\log_2 M$ LLRs of the sign bits, after a delay of one time 
step, and the $n - s$ output LLRs of the de-interleaver are given 
as the input to the LDPC decoder. The LDPC decoder outputs 
$k - k'$ message bits and $s$ bits of the sign-bit vector of the 
previous block. Now, the $s$-bit sign vector is passed through 
the syndrome former to recover the remaining $k'$ message bits. 

The demapper function at the receiver has to calculate LLRs 
taking into account the modulo-$M$ operation at the encoder 
[4]. Therefore, the received constellation $A_R$ is a replicated 
version of the $M$-PAM constellation $A$ used at the transmitter 
(assuming that scaling factors have been corrected at the 
receiver). That is, 

$$A_R = \{A - rM, \cdots, A - M, A, A + M, \cdots, A + rM\}.$$

The number of replications $r$ is chosen so that the average 
power of $A_R$ is approximately equal to the total average power 
$P_X + P_S$, and the bit mapping of the symbol $a + jM$ ($a \in A$, 
$-r \le j \le r$) is the same as that for $a$. The LLR for the 
$i$-th bit in the $j$-th symbol $Y_j$ is computed according to the 
constellation $A_R$ using the following formula: 

$$\mathrm{LLR}_i(Y_j) = \log \frac{\sum_{a \in A_R:\ \text{bit } i = 0} \exp\left(-\frac{1}{2}\frac{|Y_j - a|^2}{\alpha P_N}\right)}{\sum_{a \in A_R:\ \text{bit } i = 1} \exp\left(-\frac{1}{2}\frac{|Y_j - a|^2}{\alpha P_N}\right)}.$$


Since the constellation mapping is nearly Gray, iterations 
with the demapper do not provide significant improvements 
in coding gain, particularly for large M. 
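
A minimal sketch of such a demapper for the 16-PAM labels of Fig. 1 is given below. The function name `demap_llrs`, the choice r = 2 and the power values are assumptions made for illustration; the sums are evaluated with a log-sum-exp step for numerical stability, which is an implementation choice and not something the paper specifies.

```python
from math import exp, log

M = 16
LABELS = ["1000", "1001", "1011", "1010", "1110", "1111", "1101", "1100",
          "0000", "0001", "0011", "0010", "0110", "0111", "0101", "0100"]


def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + log(sum(exp(x - m) for x in xs))


def demap_llrs(y, noise_var, r=2):
    """LLRs, log P(bit = 0) / P(bit = 1), of the 4 label bits for one symbol y.

    The constellation is replicated at offsets -r*M, ..., r*M, and each
    replica a + j*M keeps the bit label of a.
    """
    llrs = []
    for i in range(4):
        metrics = {"0": [], "1": []}
        for idx, label in enumerate(LABELS):
            a = -(M - 1) / 2 + idx
            for j in range(-r, r + 1):
                d = y - (a + j * M)
                metrics[label[i]].append(-d * d / (2 * noise_var))
        llrs.append(log_sum_exp(metrics["0"]) - log_sum_exp(metrics["1"]))
    return llrs


P_X, P_N = 8.0, 0.05              # hypothetical powers, for illustration only
alpha = P_X / (P_X + P_N)         # MMSE factor
llrs = demap_llrs(2.5, noise_var=alpha * P_N)
print(llrs)
```

Because replicas share the label of the base symbol, a received value shifted by a multiple of M (within the replicated range) yields essentially the same LLRs, which is how the modulo operation at the encoder is accounted for.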
Fig. 3. Decoder structure. 



IV. Simulation Results 

For simulations, we have taken $n = 40000$, $k = 30000$, 
$k' = 5000$ with $M = 16$; this results in $s = 10000$. 
The constellation mapping is as given in Fig. 1. We 
have chosen a rate-1/2, memory-8 (256-state) non-systematic 
convolutional code with generator polynomials 
(D^ + + + + D + 1, + D'^ + + + l) 
as the sign-bit shaping code. A non-systematic convolutional 
code is used to avoid error propagation problems. 

A randomly constructed irregular LDPC code (40000, 
35000) of rate 7/8 with variable node degree distribution: 
0.1256X + 0.7140a;^ + 0.1604a;^ and check node degree dis- 
tribution x^^ is used as the channel code. The overall rate of 
transmission is seen to be $\frac{30000}{40000} \times 4 = 3$ bits per channel 
use. Fig. 4 shows BER plots over an AWGN channel and a 
DPC channel with interference known at the transmitter. The 
interfering vector was generated at random for different power 
levels. The plot with interference did not change appreciably 
for all power levels of interference, and we have provided one 
plot for illustration. We see that a BER of 10 ^ is achieved at 
an SNR of 19.45 dB with interference, and at an SNR of 19.33 
dB without interference. We have simulated 1000 blocks of 
length 40000 to obtain sufficient statistics for a BER of 10"^. 

The AWGN capacity equals 3 bits per channel use at an SNR of 
$10\log_{10}(2^6 - 1) = 17.99$ dB. This shows that we 
are 1.46 dB away from the ideal dirty paper channel capacity. 
The granular gain $1/(12\,G(\Lambda)) = 2^{2C^*}/(12 S_x)$ is computed from the 
simulations to be 1.282 dB [4], where $C^* = 3.5$ is the rate 
before channel coding, and $S_x$ is the transmit power (obtained 
through simulations). From this, the shaping loss is calculated 
as follows: 

$$10\log_{10}\left(\frac{2\pi e\, G(\Lambda)\, 2^{2R} - 1}{2^{2R} - 1}\right) = 0.2548\ \text{dB}, \tag{7}$$

where $R = 3$ bits per channel use is the overall transmission rate. 
So, of the total gap of 1.46 dB, we have a shaping gap of 
0.2548 dB and a coding gap of 1.2052 dB to capacity. 

Fig. 4. BER plot for DPC and AWGN. 
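
The gap figures quoted in this section can be re-derived from the reported numbers. The sketch below assumes the standard relation between granular gain and normalized second moment, namely that the granular gain equals 1/(12 G(Λ)) with an ultimate value of πe/6; the variable names are illustrative.

```python
from math import e, log10, pi

rate = 3.0                   # overall rate in bits per channel use
snr_required_db = 19.45      # simulated SNR with interference (Fig. 4)
granular_gain_db = 1.282     # granular gain estimated from simulations

# SNR at which the AWGN capacity equals the transmission rate.
snr_capacity_db = 10 * log10(2 ** (2 * rate) - 1)
total_gap_db = snr_required_db - snr_capacity_db

# 2*pi*e*G(Lambda), obtained from the distance to the ultimate shaping gain pi*e/6.
max_gain_db = 10 * log10(pi * e / 6)
two_pi_e_G = 10 ** ((max_gain_db - granular_gain_db) / 10)

# Shaping loss as in (7), then the remaining coding gap.
shaping_loss_db = 10 * log10((two_pi_e_G * 2 ** (2 * rate) - 1)
                             / (2 ** (2 * rate) - 1))
coding_gap_db = total_gap_db - shaping_loss_db

print(round(snr_capacity_db, 2), round(total_gap_db, 2),
      round(shaping_loss_db, 4), round(coding_gap_db, 2))
```

Small differences in the last digits relative to the text come from rounding the capacity SNR to 17.99 dB before subtracting.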



V. Application to Gaussian Broadcast Channel 

We use the proposed scheme for superposition coding in 
a two-user Gaussian broadcast channel $Y_1 = X + N_1$ and 
$Y_2 = X + N_2$ with $P_{N_1} > P_{N_2}$. We let $P_{X_1} = (1 - \beta) P$ 
and $P_{X_2} = \beta P$, where $P$ is the total transmit power. Here, 
User 2 is coded using DPC considering User 1 as interference. 
User 1 is shaped using sign-bit shaping and coded using an 
LDPC code over $M$-PAM. Fig. 5 shows a block diagram of 
the transmitter and receivers. The encoder structure for User 
1 is as in Fig. 2 with the interference vector $S = 0$. Hence, 
for User 1, the shaping coder minimizes the energy of $v$. The 
demapper at Receiver 1 calculates the LLR for the $i$-th bit in the 
$j$-th received symbol $Y_{1j}$ using the following formula. 

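$$\mathrm{LLR}_i(Y_{1j}) = \log \frac{\sum_{a \in A:\ \text{bit } i = 0} p(a) \exp\left(-\frac{1}{2}\frac{|Y_{1j} - a|^2}{\beta P + P_{N_1}}\right)}{\sum_{a \in A:\ \text{bit } i = 1} p(a) \exp\left(-\frac{1}{2}\frac{|Y_{1j} - a|^2}{\beta P + P_{N_1}}\right)},$$

where $p(a)$ for $a \in A$ represents the a priori probability of the 
$M$-PAM symbol $a$. At the receiver, we approximate $p(a)$ using 
a Gaussian distribution with variance $P_S$, assuming that the 
distribution of $M$-PAM symbols is approximately Gaussian. 

Fig. 5. Block diagram for a two-user Gaussian broadcast channel. 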



We simulated a two-user degraded broadcast channel with 
$P_{N_1} = 0.9$ and $P_{N_2} = 0.09$ using the proposed scheme with 
parameters from Section IV. The total transmit power, power 
for User 1 and power for User 2 required for a bit error rate of 
10~^ (at both receivers) are estimated from the simulation and 
denoted $P$, $P_{X_1}$ and $P_{X_2}$, respectively. The SNR for Receiver 
1 is computed as $10\log_{10}\left(\frac{P_{X_1}}{P_{X_2} + P_{N_1}}\right) = 19.1791$ dB. Since 
DPC is done for User 2, the effective SNR at Receiver 2 
is computed as $10\log_{10}\left(\frac{P_{X_2}}{P_{N_2}}\right) = 19.4574$ dB. Comparison 
with the SNR needed for a single-user capacity of 3 bits per 
channel use (which is 17.99 dB) shows that the total loss for 
both the users is about 2.4642 dB. Fig. 6 shows the (3, 3) rate 
pair in the capacity region of the two-user Gaussian broadcast 
channel with total transmit power $P$ and noise powers $P_{N_1}$, 
$P_{N_2}$, which is defined by 

$$R_1 \le \frac{1}{2}\log_2\left(1 + \frac{(1 - \beta) P}{\beta P + P_{N_1}}\right), \qquad R_2 \le \frac{1}{2}\log_2\left(1 + \frac{\beta P}{P_{N_2}}\right)$$

for $0 \le \beta \le 1$. We see that the (3, 3) rate 
point is clearly outside the time-sharing region. 
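
The claim about the (3, 3) rate point can be checked from the reported SNRs and noise powers. In the sketch below the transmit powers are not taken from the text; they are backed out from the two reported SNR values, so they are derived quantities, and all variable names are illustrative.

```python
from math import log2

P_N1, P_N2 = 0.9, 0.09
snr1_db, snr2_db = 19.1791, 19.4574   # reported SNRs at Receivers 1 and 2

# Powers implied by the reported SNRs (derived here, not stated in the text).
P_X2 = P_N2 * 10 ** (snr2_db / 10)
P_X1 = (P_X2 + P_N1) * 10 ** (snr1_db / 10)
P = P_X1 + P_X2
beta = P_X2 / P


def capacity(snr):
    """AWGN capacity in bits per channel use at a given linear SNR."""
    return 0.5 * log2(1 + snr)


# Superposition-coding bounds on R1 and R2 for this power split.
R1_max = capacity((1 - beta) * P / (beta * P + P_N1))
R2_max = capacity(beta * P / P_N2)

# Time-sharing between the single-user rates C1 and C2 supports (R1, R2)
# only if R1 / C1 + R2 / C2 <= 1.
C1 = capacity(P / P_N1)
C2 = capacity(P / P_N2)
time_sharing_load = 3 / C1 + 3 / C2

print(round(R1_max, 3), round(R2_max, 3), round(time_sharing_load, 3))
```

A `time_sharing_load` above 1 means the (3, 3) point cannot be reached by time-sharing at this total power, while both superposition bounds exceeding 3 place it inside the capacity region.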

VI. Conclusions 

In this work, we have proposed a method for designing 
lattice-based schemes for dirty paper coding using sign-bit 
shaping and LDPC codes. Simulation results show that the 
proposed design performs 1.46 dB away from the dirty paper 
capacity for a block length of $n = 40000$ at the rate of 
3 bits/channel use. This performance is comparable to other 
results in the literature. However, as discussed in this article, 
a novel method for combining shaping and coding results 
in good gains at lower complexity in our design, when 
compared to other lattice-based strategies. As an application, 
we have designed a superposition coding scheme for Gaussian 
broadcast channels that is shown to perform better than 
time-sharing through simulations. 

Out of the 1.46 dB gap to capacity, about 1.2 dB is 
attributed to a sub-optimal choice of the LDPC code. 
Optimizing the LDPC code will require use of genetic algorithms 
and asymmetric density evolution [12], which are topics for 
future work. 
Fig. 6. Two-user Gaussian broadcast channel capacity region. 